## Overview

## GOALS

## WHY

## OUTLINE

Questions we have answered:

1. Do the streaming platforms gear their content towards a certain age group more than others?

2. Does an “A list” actor always guarantee a higher rating on IMDb? Do some platforms have more content with “A list” actors?

3. Are movies and TV shows always improving in quality over time? (See the average rating from two platforms by year.)

4. Do certain platforms contain more movies in specific genres than other genres?

5. Do certain streaming platforms have more movies than TV shows, or vice versa?

## DATA

Loading Packages

Getting Data

Description of data: Our primary dataset contains information about movies and TV shows on four different streaming platforms. Its columns are:

  • X: row number
  • ID: unique TV show ID
  • Title: name of the TV show
  • Year: the year in which the TV show was produced
  • Age: target age group
  • IMDb: IMDb rating
  • Rotten.Tomatoes: Rotten Tomatoes rating
  • Netflix: whether the TV show is found on Netflix
  • Hulu: whether the TV show is found on Hulu
  • Prime.Video: whether the TV show is found on Prime Video
  • Disney.: whether the TV show is found on Disney
  • Type: movie or TV show (movie is 0, TV show is 1)

The IMDb database provided the names of the actors in each film and show and the genres each falls into, and the Top 1000 Actors dataset provided the information needed to tell whether a film or show contained any A-list actors.

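library(dplyr)  # %>%, mutate(), select(), rename()
library(tidyr)  # separate()

# TV shows: split each rating string at "/" and rescale both ratings to 0-1
tv_show <- read.csv(file = "tv_shows.csv") %>%
  separate(IMDb, c("IMDb", "I-fullRate"), sep = "/") %>%
  separate(Rotten.Tomatoes, c("Rotten.Tomatoes", "R-fullRate"), sep = "/") %>%
  mutate(IMDb = as.numeric(IMDb)/10,
         Rotten.Tomatoes = as.numeric(Rotten.Tomatoes)/100) %>%
  # Drop columns by position, then label every row as a TV show
  select(c(-1,-2,-7,-9)) %>% mutate(MT = "TV")
head(tv_show)

# Movies: same treatment for the Rotten Tomatoes rating
movie <- read.csv(file = "MoviesOnStreamingPlatforms.csv") %>%
  separate(Rotten.Tomatoes, c("Rotten.Tomatoes", "R-fullRate"), sep = "/") %>%
  mutate(Rotten.Tomatoes = as.numeric(Rotten.Tomatoes)/100) %>%
  # Drop columns by position, then label every row as a movie
  select(-1,-2,-7) %>% mutate(MT = "Movie")
head(movie)

# IMDb database: rename the title column so it matches the other tables
imdb  <- read.csv("imdb_database.csv") %>%
  rename(Title=Movie.Name)
head(imdb)

# A-list actors: keep each actor's name and the title they are known for
Alist <- read.csv(file = "Top 1000 Actors and Actresses.csv") %>%
  select(Name, Known.For) %>% rename(Title=Known.For)
head(Alist)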
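
To make the rating cleanup above concrete, here is a minimal sketch on a made-up two-row data frame. The titles and values are invented for illustration and are not taken from the dataset. It assumes ratings arrive as strings such as "9.4/10" and "96/100", which is what the `separate()` calls imply.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy rows; not real data from tv_shows.csv
toy <- data.frame(
  Title = c("Show A", "Show B"),
  IMDb = c("9.4/10", "8.7/10"),
  Rotten.Tomatoes = c("96/100", "93/100"),
  stringsAsFactors = FALSE
)

toy_clean <- toy %>%
  # Split "9.4/10" into the score and its denominator
  separate(IMDb, c("IMDb", "I-fullRate"), sep = "/") %>%
  separate(Rotten.Tomatoes, c("Rotten.Tomatoes", "R-fullRate"), sep = "/") %>%
  # Rescale both ratings to the 0-1 range
  mutate(IMDb = as.numeric(IMDb) / 10,
         Rotten.Tomatoes = as.numeric(Rotten.Tomatoes) / 100) %>%
  # Drop the leftover denominator columns by name
  select(-`I-fullRate`, -`R-fullRate`)

print(toy_clean)
```

Dropping the helper columns by name, as in this sketch, is one option that keeps working if the column order of the input file changes.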

Cleaning Data

Analysis

  1. Do the streaming platforms gear their content towards a certain age group more than others?
  • Disney has the most “family friendly” content: most of its content is rated for all ages or 13+, and very little is geared towards ages 16+ or 18+.
  • Netflix has the most content geared towards viewers 18+, so a Netflix subscription may make less sense than a Disney subscription for families with very young children. However, Netflix does have more content for younger ages than Hulu and Prime Video.
  • Prime Video's content is also mostly 18+, similar to Netflix, so a Prime Video subscription makes more sense for older individuals or families with older children. However, it has less content in this category than Netflix.
  • Hulu's content is mostly geared towards ages 16+ and 18+, so once again a Hulu subscription makes less sense than a Disney subscription for families with young children. Hulu has less content for 13+ and all ages than any other platform.
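
A comparison like the one above can be sketched by computing, for each platform, the share of its titles in each age group. The data below are invented, and the age labels (including "7+") and the three platform columns are assumptions chosen to mirror the layout described in the data section.

```r
# Hypothetical toy data; 1 means the title is available on that platform
toy <- data.frame(
  Age     = c("all", "7+", "13+", "16+", "18+", "18+", "all", "16+"),
  Netflix = c(0, 1, 1, 0, 1, 1, 0, 0),
  Disney. = c(1, 1, 0, 0, 0, 0, 1, 0),
  Hulu    = c(0, 0, 0, 1, 0, 1, 0, 1)
)

platforms  <- c("Netflix", "Disney.", "Hulu")
age_levels <- c("all", "7+", "13+", "16+", "18+")

# For each platform, the fraction of its titles that falls in each age group
age_share <- sapply(platforms, function(p) {
  on_platform <- toy$Age[toy[[p]] == 1]
  counts <- table(factor(on_platform, levels = age_levels))
  as.numeric(counts) / sum(counts)
})
rownames(age_share) <- age_levels

print(round(age_share, 2))
```

Using shares rather than raw counts keeps a large catalogue from dominating the comparison, although raw counts matter too when the question is how much content a subscriber actually gets.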
  2. Does an “A list” actor always guarantee a higher rating on IMDb?
##                Estimate  Std. Error   t value      Pr(>|t|)
## (Intercept) -0.49327589 0.017979887 -27.43487 8.134353e-157
## Score        0.09901313 0.002700458  36.66531 1.772742e-267
## [1] 1.960332
## 
##  Welch Two Sample t-test
## 
## data:  Alist_movie$Score and Alist_movie$is_Alist
## t = 331.1, df = 7172.3, p-value < 2.2e-16
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  6.307471      Inf
## sample estimates:
## mean of x mean of y 
##  6.488097  0.149131
##               Estimate  Std. Error   t value      Pr(>|t|)
## (Intercept) -0.8422401 0.029672686 -28.38436 6.739653e-162
## Score        0.1583109 0.004120424  38.42103 6.136821e-276
## [1] 1.960546
## 
##  Welch Two Sample t-test
## 
## data:  Alist_TV$Score and Alist_TV$is_Alist
## t = 284.95, df = 4844.1, p-value < 2.2e-16
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  6.739794      Inf
## sample estimates:
## mean of x mean of y 
## 7.0533072 0.2743753
## Platform
##      Disney        Hulu     Netflix Prime.Video 
##         793        1774        3985        3974

Null Hypothesis: A movie/TV show with an A-list actor rates the same as a movie/TV show without an A-list actor.

Alt Hypothesis: A movie/TV show with an A-list actor rates higher than a movie/TV show without an A-list actor.

  • Based on the linear regression, and on comparing each t statistic with its critical value (about 1.96), we reject the null hypothesis: there is statistically significant evidence that a movie/TV show with an A-list actor rates higher on IMDb, on average, than a movie/TV show without one.
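
One way to line a test up directly with the hypotheses stated above is to split the scores into two groups by the A-list indicator and compare the group means. The sketch below does this with a one-sided Welch test on synthetic data. The column names `Score` and `is_Alist` follow the output shown earlier, but the values are randomly generated for illustration and say nothing about the real results.

```r
set.seed(1)

# Synthetic scores, invented for illustration only
toy <- data.frame(
  is_Alist = rep(c(0, 1), times = c(200, 50)),
  Score    = c(rnorm(200, mean = 6.0, sd = 1),
               rnorm(50,  mean = 7.5, sd = 1))
)

# Welch two-sample t-test on Score, grouped by is_Alist.
# With the formula interface the first group (is_Alist = 0) is "x", so
# alternative = "less" tests H1: mean(no A-list) < mean(A-list).
res <- t.test(Score ~ is_Alist, data = toy, alternative = "less")

print(res)
```

A small p-value here would support the alternative hypothesis on average. It would not show that an A-list actor guarantees a higher rating for any individual title.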
  3. Are movies and TV shows always improving in quality over time? (See the average rating from two platforms by year.)
  • The dataset provides only Rotten Tomatoes scores for movies, but both IMDb and Rotten Tomatoes scores for TV shows.
  • In the graphic comparing release year with Rotten Tomatoes rating for movies and TV shows, scores rise slightly from shortly before 1950 until just before 2000, then decline through the present. However, this is only a general trend; with so many movies and shows, no firm conclusion about ratings can be drawn this way.
  • The scores, especially in the 2000s, are very spread out, with some movies and shows receiving extremely high scores and others extremely low ones. This spread affects where the smoother is placed in the graphic and makes it difficult to say that Rotten Tomatoes scores definitively decreased over time.
  • The graphic showing the Rotten Tomatoes scores of individual movies still contains too many points to support definitive conclusions, but the graphic showing the average Rotten Tomatoes score of movies in each year shows a bit more. The average score fluctuates from year to year and has not consistently increased or decreased over time. It appears to decrease from 1914 until just before 1950, increase until around 2000, and decline since then.
  • For TV shows, the dataset provides both IMDb and Rotten Tomatoes scores. The graphics showing the IMDb and Rotten Tomatoes scores of all individual TV shows again contain too many points to reveal a clear trend over time. In the graphics showing the average IMDb and average Rotten Tomatoes scores of TV shows, there is no consistent upward or downward trend. Both averages appear to rise and fall in the same years, which is interesting since IMDb and Rotten Tomatoes use different scoring methods.
  • IMDb scores are based on scores and reviews submitted by IMDb users, whereas Rotten Tomatoes scores are based on critics approved by Rotten Tomatoes. Overall, IMDb scores would seem to be a bit more representative of general audiences, as anyone can submit a review, not just an approved critic.
  • Overall, the average scores of both movies and TV shows do not follow a specific pattern of increasing or decreasing; rather, scores increase in some years and decrease in others.
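
The per-year averages discussed above can be sketched as a simple group-wise mean. The example uses invented scores on the 0-1 scale from the data section, and also counts the titles behind each average, since a mean built from only a handful of titles is noisy. The column names `mean_score` and `n_titles` are hypothetical.

```r
# Hypothetical toy data; scores are on the rescaled 0-1 range
toy <- data.frame(
  Year            = c(1990, 1990, 2000, 2000, 2000, 2010),
  Rotten.Tomatoes = c(0.80, 0.60, 0.90, 0.70, 0.50, 0.40)
)

# Mean score per release year
yearly <- aggregate(Rotten.Tomatoes ~ Year, data = toy, FUN = mean)
names(yearly)[2] <- "mean_score"

# Number of titles behind each yearly mean, to flag sparse years
yearly$n_titles <- as.vector(table(toy$Year)[as.character(yearly$Year)])

print(yearly)
```

Plotting `mean_score` against `Year` gives the kind of averaged view described above, and `n_titles` helps judge how much weight the early, sparse years deserve.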
  4. Do certain platforms contain more movies in specific genres than other genres?

  • Disney has more family, adventure, and animation movies. Its movies are also more evenly distributed across genres than those on the other platforms.
  • Netflix, Prime Video, and Hulu all have more drama and comedy movies than movies of any other genre.
  • Different platforms therefore represent different genres more or less heavily in the movies they offer. However, we must take this with a grain of salt: some genres contain more movies overall than others, and a movie can fall into more than one genre. For example, comedy and drama are among the most common genres on all platforms, but these two genres also contain more movies overall than genres like war or western, so we would expect them to be more heavily represented on every platform.
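
Both caveats above can be handled in a small sketch: multi-genre movies are expanded to one row per genre, and counts are turned into shares of each platform's genre tags so that platforms of different sizes are comparable. The data are invented, and the `Genre` column with comma-separated values is an assumption about how genres are stored.

```r
# Hypothetical toy data; Genre is assumed to be a comma-separated string
toy <- data.frame(
  Title   = c("A", "B", "C", "D"),
  Genre   = c("Comedy,Drama", "Drama", "Family,Animation", "Comedy"),
  Netflix = c(1, 1, 0, 1),
  Disney. = c(0, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Expand to one row per (title, genre) pair
genre_list <- strsplit(toy$Genre, ",")
long <- data.frame(
  Genre   = unlist(genre_list),
  Netflix = rep(toy$Netflix, lengths(genre_list)),
  Disney. = rep(toy$Disney., lengths(genre_list))
)

# Number of genre tags per genre on each platform
counts <- as.matrix(rowsum(long[, c("Netflix", "Disney.")], group = long$Genre))

# Share of each platform's genre tags, so each column sums to 1
shares <- prop.table(counts, margin = 2)

print(round(shares, 2))
```

Because a movie with two genres contributes two tags, these shares describe genre tags rather than distinct movies, which is worth keeping in mind when reading the result.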
  5. Do certain streaming platforms have more movies than TV shows, or vice versa?
  • Disney, Netflix, and Prime Video have more movies than TV shows, whereas Hulu has more TV shows than movies.
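
A count like this can be sketched by stacking the movie and TV tables and summing the platform indicators within each type. The toy tables below are invented. They reuse the `MT` label that the loading code attaches to each table, with only two platforms shown for brevity.

```r
# Hypothetical toy tables; 1 means the title is on that platform
toy_movie <- data.frame(Title = c("M1", "M2", "M3"),
                        Netflix = c(1, 1, 0),
                        Hulu    = c(0, 0, 1),
                        MT = "Movie")
toy_tv <- data.frame(Title = c("T1", "T2", "T3", "T4"),
                     Netflix = c(1, 0, 0, 0),
                     Hulu    = c(1, 1, 1, 0),
                     MT = "TV")

# Stack the two tables, then count titles per type on each platform
combined <- rbind(toy_movie, toy_tv)
counts <- rowsum(combined[, c("Netflix", "Hulu")], group = combined$MT)

print(counts)
```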